PasteGAN: A Semi-Parametric Method to Generate Image from Scene Graph

Neural Information Processing Systems

Despite exciting progress on high-quality image generation from structured (scene graph) or free-form (sentence) descriptions, most existing methods only guarantee image-level semantic consistency, i.e., the generated image matches the semantic meaning of the description. They still lack the ability to synthesize images in a more controllable way, such as finely manipulating the visual appearance of every object. Therefore, to generate images with preferred objects and rich interactions, we propose a semi-parametric method, PasteGAN, for generating an image from a scene graph and image crops, where the spatial arrangements of the objects and their pair-wise relationships are defined by the scene graph, and the object appearances are determined by the given object crops. To enhance the interactions among the objects in the output, we design a Crop Refining Network and an Object-Image Fuser to embed the objects as well as their relationships into one map. Multiple losses work collaboratively to guarantee that the generated images closely respect the crops and comply with the scene graphs while maintaining excellent image quality. We also propose a crop selector that picks the most compatible crops from an external object tank by encoding the interactions around each object in the scene graph when crops are not provided. Evaluated on the Visual Genome and COCO-Stuff datasets, our proposed method significantly outperforms the state-of-the-art methods on Inception Score, Diversity Score, and Fréchet Inception Distance. Extensive experiments also demonstrate our method's ability to generate complex and diverse images with given objects.
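To make the input format concrete, here is a minimal sketch of the kind of scene-graph-plus-crops input the abstract describes: a set of objects, (subject, predicate, object) triples defining their pair-wise relationships, and optional crops fixing each object's appearance. All names here (`SceneGraph`, `relations_of`) are illustrative assumptions, not the paper's actual data structures.

```python
from dataclasses import dataclass, field

@dataclass
class SceneGraph:
    objects: list                               # object category names, e.g. "sheep"
    triples: list                               # (subject_idx, predicate, object_idx)
    crops: dict = field(default_factory=dict)   # object_idx -> crop image (optional)

    def relations_of(self, obj_idx):
        """All (predicate, other-object) pairs touching obj_idx, i.e. the
        interaction context a crop selector could encode when no crop is given."""
        out = []
        for s, p, o in self.triples:
            if s == obj_idx:
                out.append((p, self.objects[o]))
            elif o == obj_idx:
                out.append((p, self.objects[s]))
        return out

sg = SceneGraph(objects=["sheep", "grass", "tree"],
                triples=[(0, "standing on", 1), (2, "behind", 0)])
print(sg.relations_of(0))   # interaction context around "sheep"
```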


Reviews: PasteGAN: A Semi-Parametric Method to Generate Image from Scene Graph

Neural Information Processing Systems

Limited novelty: The proposed approach is closely related to two lines of related work: 1) sg2im [4], which generates images from scene graph representations, and 2) semi-parametric image synthesis [3], which leverages semantic layouts and training images to generate novel images. The key difference from sg2im is the use of image crops in order to perform semi-parametric synthesis; however, in comparison to prior work on semi-parametric methods [3], as the authors suggest (Lines 82-83), the primary difference is the use of a graph convolution architecture, and a similar graph convolution method has already been introduced in [4]. I'd like to see more justification from the authors regarding the technical novelty of this approach in the presence of these two lines of work. Limited resolution: My concern about the limited novelty is exacerbated by the fact that the generated images are still low-resolution (64x64), as in prior work [4], even though high-resolution image crops are used to aid the image generation process. In contrast, related work [3] is able to generate images of much higher resolutions, e.g., 512x1024, using a semi-parametric method (which was not compared in the experiments).
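For readers unfamiliar with the graph convolution the review refers to, a simplified numpy sketch of sg2im-style message passing over scene-graph triples follows: each (subject, predicate, object) triple is concatenated, passed through a shared linear map, and the resulting candidate vectors are pooled back onto the participating nodes. This is a toy single-layer version under assumed shapes, not the papers' actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # embedding size per object / predicate

def graph_conv_layer(obj_vecs, pred_vecs, triples, W):
    """One simplified graph-convolution step: for each triple (s, p, o),
    map the concatenated [subject, predicate, object] vectors through W,
    then average the candidate vectors back onto each object slot."""
    new_obj = np.zeros_like(obj_vecs)
    counts = np.zeros(len(obj_vecs))
    new_pred = np.zeros_like(pred_vecs)
    for s, p, o in triples:
        x = np.concatenate([obj_vecs[s], pred_vecs[p], obj_vecs[o]])
        h = np.tanh(W @ x)                        # (3*D,) -> (3*D,)
        hs, hp, ho = h[:D], h[D:2 * D], h[2 * D:]
        new_obj[s] += hs; counts[s] += 1
        new_obj[o] += ho; counts[o] += 1
        new_pred[p] = hp
    counts = np.maximum(counts, 1)                # isolated objects keep zeros
    return new_obj / counts[:, None], new_pred

obj_vecs = rng.normal(size=(3, D))    # 3 objects
pred_vecs = rng.normal(size=(2, D))   # 2 predicate types
triples = [(0, 0, 1), (2, 1, 0)]      # e.g. (sheep, standing-on, grass), ...
W = rng.normal(size=(3 * D, 3 * D)) * 0.1
new_obj, new_pred = graph_conv_layer(obj_vecs, pred_vecs, triples, W)
print(new_obj.shape, new_pred.shape)
```

Stacking several such layers propagates relationship information across the whole graph before the embeddings are decoded into a layout.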


Reviews: PasteGAN: A Semi-Parametric Method to Generate Image from Scene Graph

Neural Information Processing Systems

This submission received borderline positive reviews. While the reviewers ultimately did not reach consensus during the discussion period, one did step forward to 'champion' the paper, and another was supportive of this decision. This submission is a 'systems paper,' and should be evaluated as such. The paper does not focus on new algorithmic results, but rather on building a nontrivial system to achieve impressive results on an important problem, and it justifies its design decisions (e.g. with an ablation study). There is some concern about the output images being low-resolution.



PasteGAN: A Semi-Parametric Method to Generate Image from Scene Graph

LI, Yikang, Ma, Tao, Bai, Yeqi, Duan, Nan, Wei, Sining, Wang, Xiaogang

Neural Information Processing Systems



Comparing Semi-Parametric Model Learning Algorithms for Dynamic Model Estimation in Robotics

Riedel, Sebastian, Stulp, Freek

arXiv.org Machine Learning

Physical modeling of robotic system behavior is the foundation for controlling many robotic mechanisms to a satisfactory degree. Mechanisms are also typically designed so that good model accuracy can be achieved with relatively simple models and model identification strategies. If the modeling accuracy of physically based models is insufficient, or the models become too complex, model-free methods based on machine learning techniques can help. Of particular interest to us was therefore the question of to what degree semi-parametric modeling techniques, meaning combinations of physical models with machine learning, increase the modeling accuracy of the inverse dynamics models typically used in robot control. To this end, we evaluated semi-parametric Gaussian process regression and a novel model-based neural network architecture, and compared their modeling accuracy to a series of naive semi-parametric, parametric-only, and non-parametric-only regression methods. The comparison was carried out on three test scenarios, one involving a real test bed and two involving simulated scenarios, with the most complex scenario targeting the modeling of a simulated robot's inverse dynamics model. We found that in all but one case, semi-parametric Gaussian process regression yields the most accurate models, with little tuning required for the training procedure.
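The core idea of semi-parametric Gaussian process regression, as described above, is to let a physical (parametric) model carry the bulk of the prediction and fit a GP only to its residuals. The following is a minimal 1-D sketch under toy assumptions: the "dynamics" are an invented scalar function, the physical model is deliberately mis-specified, and the GP is a bare-bones RBF interpolator rather than any of the paper's actual estimators.

```python
import numpy as np

rng = np.random.default_rng(1)

def physical_model(q):
    # Idealized linear torque model (deliberately missing a nonlinearity)
    return 2.0 * q

def true_system(q):
    # Toy "real" dynamics: the linear part plus an unmodeled effect
    return 2.0 * q + 0.5 * np.sin(3 * q)

def rbf(a, b, ls=0.5):
    # Squared-exponential kernel between two 1-D point sets
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

# Training data: noisy observations of the true system
q_train = np.linspace(-2, 2, 25)
tau_train = true_system(q_train) + 0.01 * rng.normal(size=q_train.size)

# The GP is fit only to the physical model's residuals
resid = tau_train - physical_model(q_train)
K = rbf(q_train, q_train) + 1e-4 * np.eye(q_train.size)
alpha = np.linalg.solve(K, resid)

def semi_parametric_predict(q_test):
    # Physical prior + GP correction on the residual
    return physical_model(q_test) + rbf(q_test, q_train) @ alpha

q_test = np.array([0.3, 1.1])
err_param = np.abs(physical_model(q_test) - true_system(q_test)).max()
err_semi = np.abs(semi_parametric_predict(q_test) - true_system(q_test)).max()
print(err_param, err_semi)   # semi-parametric error should be much smaller
```

Because the GP only has to model the residual, it needs far less data and tuning than a GP learning the full dynamics from scratch, which is one plausible reading of why the semi-parametric variant wins in most of the paper's scenarios.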